The distillation of magic states is an often-cited technique for enabling universal quantum
computing once the error probability for a special subset of gates has been made negligible by
other means. We present a routine for magic-state distillation that reduces the required overhead
for a range of parameters of practical interest. Each iteration of the routine uses a four-qubit
error-detecting code to distill the +1 eigenstate of the Hadamard gate at a cost of ten input states per
two improved output states. Use of this routine in combination with the 15-to-1 distillation routine
described by Bravyi and Kitaev allows for further improvements in overhead.



Many techniques for robustly implementing quantum 
gates most naturally generate only a finite subset of the 
unitary operators. Frequently, the naturally convenient 
quantum operations generate the full set of Clifford op- 
erations, which consists of the Clifford group of unitaries 
augmented by measurement and state preparation in 
the standard basis. Clifford operations are sufficient for 
stabilizer-state preparations and measurements and thus 
underlie stabilizer-based error correction and much of the 
associated theory of fault tolerance. Though inadequate 
for universal quantum computing, the Clifford operations 
can be supplemented by any unitary outside of the 
Clifford group to obtain a universal set [1]. Consequently, 
the problem of achieving universality is often reduced to 
that of finding a way of robustly implementing a single 
non-Clifford unitary gate. 

Given the ability to perform Clifford operations, non- 
Clifford gates can be indirectly implemented using cer- 
tain non-stabilizer states as a consumable resource. The 
advantage of this approach lies in the possibility of 
distilling such resource states prior to use. Distillation 
is a technique whereby a collection of independently 
prepared faulty resource states can be converted into 
a smaller number of resource states whose fidelity with 
respect to the ideal state is higher. Some states have 
the property that one can distill them using only Clifford 
operations. States that are both sufficient for universality 
and distillable in this way are known as magic states. 
Magic-state distillation allows faulty magic states to 
be used as a resource for robust universal quantum 
computing. 

The notion of magic states was introduced by Bravyi
and Kitaev [2], who showed that the (magic) eigenstates
of the one-qubit Clifford gates T and H can be distilled
from copies of these states with error probabilities of up
to 0.173 and 0.141 per state, respectively. Their distillation
routines work by projecting several such faulty
copies of a specified magic state (henceforth, resource
state) into a stabilizer code and then decoding the result,
checking for and discarding on any indication of error.
Distillation of the T-eigenstate $|T\rangle$ employs a projection
onto the 5-qubit distance-3 code, while distillation of
the H-eigenstate relies on the 15-qubit Reed-Muller
code; both distillation routines result in one improved
resource state. We refer to the $|H\rangle$-distillation routine as
the 15-to-1 routine.

An apparently distinct routine for distilling $|H\rangle$ using
the 7-qubit Steane code was proposed previously by
Knill, but Reichardt found the two routines to be
equivalent [4]. Reichardt additionally showed that the
error threshold for distilling $|H\rangle$ could be improved
from 0.141 to 0.146 via a 7-to-1 distillation routine,
thereby proving that every faulty $|H\rangle$ outside of the set
of stabilizer states is distillable with a finite routine.
Campbell and Browne proved the impossibility of a
similar result for $|T\rangle$ by showing that no finite distillation
routine is capable of distilling faulty $|T\rangle$ arbitrarily near
the boundary of the stabilizer states [•'), (>]. 

The focus of each of the aforementioned papers is
on the threshold for magic-state distillation, but the
efficiency of a distillation routine is crucial to its practical
utility. Of particular concern is the number of
faulty resource states required as input to distill each
resource state of some desired quality. This ratio contributes
strongly to the overhead required to implement
a quantum computation using magic-state distillation [ ],
potentially increasing the number of qubits and gates
required by a large multiplicative factor. With this
in mind, we describe a routine for distilling $|H\rangle$ that
reduces the number of input resource states required
per output state, distilling 2 improved resource states
from 10. The routine can be used either on its own or in
combination with previously developed routines to obtain
resource reductions for a variety of parameter ranges of
interest.

After explaining the needed background in Sec. I, we
introduce and analyze the proposed distillation routine
in Secs. II and III and compare it to the 15-to-1 routine
in Sec. V. Sec. IV explains how sequential distillation
rounds can be combined. Concluding remarks appear in
Sec. VI.



I. BASIC CONCEPTS AND NOTATION 

We denote the one-qubit Pauli operators by I, X,
Y, and Z, where I is the identity and the others are
i times the conventional $\pi$-rotations associated with the
eponymous axes of the Bloch sphere. The n-qubit Pauli
group consists of all n-fold tensor products of the one-qubit
Pauli operators multiplied by $\{\pm 1, \pm i\}$.

The Clifford group consists of the unitary operators
that normalize the Pauli group. Any Clifford unitary can
be constructed by composition of tensor products of H
(Hadamard), T, and controlled-X (controlled-NOT)
gates up to an unimportant global phase. In the standard
basis, these gates are given by

For convenience, we additionally employ the following
one- and two-qubit Clifford gates: $P(\pm\pi/2) = e^{\mp i \pi P/4}$,
where $P \in \{X, Y, Z\}$ ($\pi/2$ rotations about the axes of
the Bloch sphere), $S = e^{i\pi/4} Z(\pi/2)$, controlled-Z,
controlled-Y, and SWAP. We also make frequent
use of three unitaries not contained in the Clifford group:
$Y(\pm\pi/4)$ (the $\pm\pi/4$ rotations about the Y axis) and
controlled-H.

We use the term "Clifford operation" to refer to
any quantum operation that can be implemented using
Clifford unitaries together with preparation and measurement
in the standard basis. States that can be
prepared with Clifford operations are known as stabilizer
states. Pure stabilizer states are +1 eigenstates of a
complete set of commuting (generally multi-qubit) Pauli
operators; this set of Pauli operators is known as the
stabilizer generator. Similarly, one can define a quantum
(stabilizer) code as the +1 eigenspace of a non-maximal
set of stabilizers.

Whenever necessary, subscripts on operators are used 
to identify the qubits that they act upon. A bar over 
an operator indicates that it is a logical (or encoded) 
operator. That is, it acts as the specified operator on 
qubits encoded in a quantum code. 

Quantum circuit diagrams in this paper conform to the 
notation used in Ref. [iS] with the following exceptions: 
Wires representing multiple qubits are not specially 
decorated, and the symbols [symbol] and [symbol]
are used to represent the Z gate and projective measurements
in the eigenbases of X and Z, respectively.
FIG. 1. Circuits showing that the Hadamard operator can
be measured non-destructively using only Clifford operations
and two copies of $|H\rangle$. The circuit in (a) implements a
non-destructive measurement of the Hadamard operator using
the non-Clifford controlled-H gate, while (b) and (c) give circuit
identities that can be used to break the measurement up into
Clifford operations and $|H\rangle$ resource states. The classical
control in (c) is meant to indicate (for the + case) that a
positive Y measurement triggers a $Y(\pi/2)$ gate while a negative
measurement triggers the identity.



We denote the +1 (-1) eigenstate of the Hadamard
gate by $|H\rangle$ ($|{-H}\rangle$), where
$|H\rangle = \cos(\pi/8)|0\rangle + \sin(\pi/8)|1\rangle = Y(\pi/4)|0\rangle$.
$|H\rangle$ is not a stabilizer state, so Clifford
operations are not sufficient for its preparation; however,
they are sufficient for its distillation [ ]. As shown
in Fig. 1(c), $|H\rangle$ can be used together with Clifford
operations to implement the non-Clifford gate $Y(\pi/4)$, so
$|H\rangle$ is a magic state.
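
As a quick numerical sanity check of these definitions, the following sketch (plain Python, no external libraries) verifies that cos(π/8)|0⟩ + sin(π/8)|1⟩ is a +1 eigenstate of H and that it equals Y(π/4)|0⟩. It assumes the rotation convention Y(θ) = exp(-iθY/2) implied by the description of the Pauli operators above; all helper names are illustrative and not part of the original text.

```python
import math

S = 1 / math.sqrt(2)
HADAMARD = [[S, S], [S, -S]]

def apply(matrix, vec):
    """Multiply a 2x2 real matrix by a length-2 real vector."""
    return [sum(matrix[i][k] * vec[k] for k in range(2)) for i in range(2)]

def y_rotation(theta):
    """Y(theta) = exp(-i*theta*Y/2) (assumed convention); this is a real rotation."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -s], [s, c]]

KET_0 = [1.0, 0.0]
KET_H = [math.cos(math.pi / 8), math.sin(math.pi / 8)]          # +1 eigenstate of H
KET_MINUS_H = [-math.sin(math.pi / 8), math.cos(math.pi / 8)]   # -1 eigenstate of H

def close(u, v, tol=1e-12):
    """Component-wise comparison of two vectors."""
    return all(abs(a - b) < tol for a, b in zip(u, v))
```

Here `close(apply(HADAMARD, KET_H), KET_H)` and `close(apply(y_rotation(math.pi / 4), KET_0), KET_H)` both hold, which is the content of the two equalities stated above.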

Distillation is the process of converting multiple faulty 
copies of a desired (resource) state into, typically fewer, 
improved copies of the state. In magic-state distillation, 
Clifford operations are used to project faulty magic states 
into a subspace. Given input states of suitable quality 
and a well-chosen subspace, successful projection allows 
one to extract higher-fidelity copies of the desired state. 
When failure to project is detected, the output states 
are discarded. It is assumed that Clifford operations 
can be implemented perfectly, which is justified in the 
commonly considered situation where fault-tolerant tech- 
niques provide highly accurate Clifford operations as a 
matter of course but rely on techniques such as magic- 
state distillation for universality. 

State distillation is facilitated by randomization, which
can be used to simplify errors on resource states. For
example, the twirling superoperator

$$\Pi(\rho) = \tfrac{1}{2}\rho + \tfrac{1}{2} H \rho H^{\dagger} , \qquad (1)$$

which can be implemented by applying either I or H
with equal probability, decoheres an input state in the
eigenbasis of the H operator. For any input state, the
resulting state is a probabilistic mixture of $|H\rangle$ and $|{-H}\rangle$
states. That is, for some $0 \le p \le 1$,

$$\Pi(\rho) = (1-p)\,|H\rangle\langle H| + p\,|{-H}\rangle\langle{-H}| .$$
FIG. 2. Circuits that detect whether two input states are in the subspace spanned by $|H\rangle|{-H}\rangle$ and $|{-H}\rangle|H\rangle$, when the input
states are (a) unencoded and (b) encoded in the four-qubit code. Corresponding blocks of the circuits in parts (a) and (b) are
indicated by shading. The translation from (a) to (b) relies on the fact that, for the four-qubit code, transversal Hadamard
effects $\bar{H}_1 \bar{H}_2 \overline{\mathrm{SWAP}}_{12}$. The equivalence in (b) eliminates four controlled-H gates, reducing the number of resource states required by eight.
In total, this figure shows that only four controlled-H gates are required to project the two qubits encoded in the four-qubit code into
either the subspace spanned by $\{|H\rangle|{-H}\rangle, |{-H}\rangle|H\rangle\}$ or that spanned by $\{|H\rangle|H\rangle, |{-H}\rangle|{-H}\rangle\}$. The circuits shown also apply
an incidental Hadamard gate to the second logical qubit. Further details are given in Fig. 8 in the appendix.



Because $Y|H\rangle = |{-H}\rangle$, we can characterize any faulty
$|H\rangle$ state that has been twirled with H as suffering from
stochastic Y errors with some probability p that depends
on the input state $\rho$. We assume throughout this paper
that resource states are twirled prior to use.
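
The effect of twirling can be illustrated numerically. The sketch below (plain Python; helper names are illustrative) applies the map ρ → (ρ + HρH)/2 to a real 2x2 density matrix and checks that the coherence between the two Hadamard eigenstates vanishes, leaving a mixture with error weight p = ⟨-H|ρ|-H⟩. The choice of |0⟩⟨0| as the input state is an arbitrary example.

```python
import math

S = 1 / math.sqrt(2)
H = [[S, S], [S, -S]]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def twirl(rho):
    """Apply I or H with equal probability: rho -> (rho + H rho H) / 2."""
    h_rho_h = matmul(matmul(H, rho), H)   # H is Hermitian, so H^dagger = H
    return [[(rho[i][j] + h_rho_h[i][j]) / 2 for j in range(2)] for i in range(2)]

KET_H = [math.cos(math.pi / 8), math.sin(math.pi / 8)]
KET_MINUS_H = [-math.sin(math.pi / 8), math.cos(math.pi / 8)]

def element(bra, rho, ket):
    """Matrix element <bra| rho |ket> for real vectors."""
    return sum(bra[i] * rho[i][j] * ket[j] for i in range(2) for j in range(2))

rho_zero = [[1.0, 0.0], [0.0, 0.0]]               # example input: |0><0|
twirled = twirl(rho_zero)
p = element(KET_MINUS_H, twirled, KET_MINUS_H)    # weight on |-H>
coherence_before = element(KET_H, rho_zero, KET_MINUS_H)
coherence_after = element(KET_H, twirled, KET_MINUS_H)
```

For this example the twirled state has p = sin²(π/8), and `coherence_after` is zero to machine precision while `coherence_before` is not.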

We label distillation routines by their input/output 
ratios, so an m-to-n distillation routine takes m resource 
states as input and produces n resource states as output. 

II. 10-TO-2 DISTILLATION ROUTINE

The basic form of our routine for magic-state distillation
is as follows: Resource states are encoded into a
quantum code; these encoded resource states are verified
through an encoded measurement; and finally the code is
decoded, leaving, when no errors are indicated, resource
states of better quality. The intuition behind this approach
is that one would like simply to measure whether
a resource state is good, but doing so requires additional
resource states whose own errors might go undetected
in such a measurement. Errors on these states are
rendered detectable by performing an encoded version
of the measurement in a fault-tolerant fashion. This is
the approach employed in reference [■!] for distilling $|H\rangle$
using the 7-qubit Steane code. The routine described
here is instead based on the 4-qubit error-detecting code.

As the +1 eigenstate of the Hadamard operator, the
state $|H\rangle$ can be verified by measuring H. Measurement
of the Hadamard operator is impossible using only
Clifford operations, but it can be accomplished, as shown
in Fig. 1, with the help of two additional $|H\rangle$ states.
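
To make the verification step concrete, the sketch below simulates an ideal, non-destructive Hadamard measurement. It assumes the standard construction in which an ancilla is prepared in |+⟩, a controlled-H gate acts from the ancilla onto the target, and the ancilla is then measured in the X basis; whether this matches the exact layout of Fig. 1(a) is an assumption. Projecting the ancilla onto |±⟩ leaves the target in (|ψ⟩ ± H|ψ⟩)/2, whose squared norm is the outcome probability.

```python
import math

S = 1 / math.sqrt(2)

def hadamard(vec):
    """Apply H to a single-qubit state vector."""
    return [S * (vec[0] + vec[1]), S * (vec[0] - vec[1])]

def hadamard_measurement(psi):
    """Outcome probabilities {+1: ..., -1: ...} of an ideal H measurement.

    After the controlled-H, the joint state is (|0>|psi> + |1>H|psi>)/sqrt(2);
    projecting the ancilla onto |+> or |-> leaves (|psi> +/- H|psi>)/2.
    """
    h_psi = hadamard(psi)
    probabilities = {}
    for sign in (+1, -1):
        branch = [(a + sign * b) / 2 for a, b in zip(psi, h_psi)]
        probabilities[sign] = sum(abs(x) ** 2 for x in branch)
    return probabilities

KET_H = [math.cos(math.pi / 8), math.sin(math.pi / 8)]
KET_MINUS_H = [-math.sin(math.pi / 8), math.cos(math.pi / 8)]
```

An error-free $|H\rangle$ always yields the +1 outcome, whereas a state that has suffered a Y error (that is, $|{-H}\rangle$) always yields -1, which is what allows the measurement to flag faulty resource states.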

To render errors during the Hadamard measurement
detectable, the routine first encodes a pair of faulty
resource states into the C4 code [■!] and then performs
an encoded measurement $\bar{H}_1\bar{H}_2$ on the pair. This
measurement determines whether the pair is in the logical
subspace spanned by $|H\rangle|{-H}\rangle$ and $|{-H}\rangle|H\rangle$ and can
therefore detect whether one of the states had an error
(see Sec. III).

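The C4 code is a [[4, 2, 2]] quantum code defined by the
stabilizer generator matrix:

$$\begin{bmatrix} X \otimes X \otimes X \otimes X \\ Z \otimes Z \otimes Z \otimes Z \end{bmatrix} . \qquad (2)$$

Our choices for logical X and Z operators are:

$$\bar{X}_1 = X \otimes X \otimes I \otimes I, \qquad \bar{Z}_1 = Z \otimes I \otimes I \otimes Z,$$
$$\bar{X}_2 = X \otimes I \otimes I \otimes X, \quad \text{and} \quad \bar{Z}_2 = Z \otimes Z \otimes I \otimes I.$$

Because any one-qubit Pauli operator anticommutes with
some stabilizer generator of C4, it is possible to detect any
error on a single qubit of the code.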
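
These algebraic claims are easy to check mechanically. The following sketch represents Pauli operators as four-character strings (phases ignored) and uses the fact that two Pauli strings commute exactly when they clash, meaning both are non-identity and different, on an even number of positions. The string encoding and function names are illustrative choices.

```python
STABILIZERS = ["XXXX", "ZZZZ"]
LOGICAL_X = {1: "XXII", 2: "XIIX"}
LOGICAL_Z = {1: "ZIIZ", 2: "ZZII"}

def commute(p, q):
    """True if Pauli strings p and q commute (overall phases are ignored)."""
    clashes = sum(1 for a, b in zip(p, q) if "I" not in (a, b) and a != b)
    return clashes % 2 == 0

def single_qubit_errors(n=4):
    """All weight-one Pauli errors on n qubits."""
    return ["I" * k + pauli + "I" * (n - k - 1)
            for k in range(n) for pauli in "XYZ"]

def detected(error):
    """An error is detected if it anticommutes with some stabilizer generator."""
    return any(not commute(error, s) for s in STABILIZERS)
```

Every element of `single_qubit_errors()` is detected, the logical operators commute with both generators, and each logical X anticommutes only with its partner logical Z. By contrast, a weight-two error such as `"YYII"` commutes with both generators and is therefore invisible to the syndrome.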
FIG. 3. Circuit identities for propagating Y errors on gate
states. (a) The rule for propagating a Y error on a resource
state used to apply a $Y(\pm\pi/4)$ gate as in Fig. 1(c). The effect
of the error is the same as applying the Y error after the gate.
(b) A Y error on the first resource state required to implement
a controlled-H gate using the circuits in Fig. 1 propagates to a Z error
on the control and a Y error on the target. An error on the
second resource state propagates trivially to a Y error on the
target.



The set of stabilizer generators of C4 is symmetric with
respect to exchange of X and Z, so $H \otimes H \otimes H \otimes H$ is a
valid encoded gate, and for the choice of logical Pauli operators
given above it effects a logical Hadamard on both
encoded qubits followed (or, equivalently, preceded) by
a logical SWAP. Consequently, the controlled-($\bar{H}_1\bar{H}_2\overline{\mathrm{SWAP}}_{12}$)
gate (the control is unencoded and the target is encoded
in C4) can be accomplished using a sequence of four
controlled-H gates. Using this gate one can derive a circuit that
implements the encoded measurement, $\bar{H}_1\bar{H}_2$, as shown
in Fig. 2, by means of four controlled-H gates implemented with a
total of eight resource states.
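
The action of the transversal Hadamard on the logical operators can be verified with a few lines of code. The sketch below conjugates Pauli strings by H on every qubit (X and Z are exchanged; the sign picked up by Y is dropped) and looks up which logical operator each one maps to. The labels "X1", "Z1", and so on are illustrative names for the logical operators chosen above.

```python
HADAMARD_MAP = {"I": "I", "X": "Z", "Z": "X", "Y": "Y"}   # HYH = -Y; sign dropped

LOGICALS = {"X1": "XXII", "Z1": "ZIIZ", "X2": "XIIX", "Z2": "ZZII"}

def transversal_hadamard(pauli):
    """Conjugate a Pauli string by H on every qubit, ignoring signs."""
    return "".join(HADAMARD_MAP[c] for c in pauli)

def logical_image(name):
    """Label of the logical operator that `name` is mapped to, if any."""
    image = transversal_hadamard(LOGICALS[name])
    for other, pauli in LOGICALS.items():
        if pauli == image:
            return other
    return None
```

The mapping X1 → Z2, Z1 → X2, X2 → Z1, Z2 → X1 is exactly what a logical Hadamard on both encoded qubits combined with a logical SWAP would produce, and the two stabilizer generators are simply exchanged.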

The final step of the distillation routine is to use 
Clifford operations to decode the logical qubits and 
measure the syndrome of the C4 code, leaving two output 
resource states. The routine succeeds and accepts the 
output if neither the encoded measurement nor the 
syndrome indicates an error. Otherwise the output is 
discarded. We analyze the error patterns for the full 
distillation circuit, shown in Fig. 4(a), in the next section. 



III. ANALYSIS 

Given perfect Clifford operations and twirled resource 
states, the only possible errors in our distillation circuit 
are Y errors on the input resource states. For simplicity 
we assume that the input states to be distilled are 
independent and all have the same error probability p. 

As described in the previous section, the ten input
resource states can be partitioned into two resource states
that are encoded into the code C4 (data states) and four
pairs of resource states used to implement controlled-H gates (gate
states). The effect of one error on either type of resource
state can be understood as follows.

A Y error on one of the data states becomes an encoded
Y error, which flips the outcome of the encoded measurement
(the measurement of $\bar{H}_1\bar{H}_2$) and is thus detected by
the routine. The decoding exactly reverses the encoding,
and the logical gates in between preserve logical Y errors,
so errors on data states persist on the output and are not
detected by the syndrome measurement.

As shown in Fig. 3, a Y error on one of the gate states
causes the intended controlled-H gate to act as controlled-H followed by
either $Z \otimes Y$ or $I \otimes Y$, depending on which resource state
was in error. Using circuit identities, these errors can be
propagated to a common location just before the second
set of controlled-H gates, as depicted in Fig. 4(b). At this location,
such an error appears as a combination of some logical
operator and a Y error on a single qubit, which is not
an encoded Pauli operator for the C4 code. This Y error
is followed only by logical operators, which cannot take
an error subspace to a non-error subspace, and decoding,
which returns a syndrome indicating whether the state
is in an error subspace. Consequently, a single error on
a gate state is detected by the syndrome measurement.

The effect of multiple errors is best understood by
propagating the errors from both gate and data states
to two locations, as described in Fig. 4. The Y Pauli operators
from any pair of errors on gate states (described
above) combine to form a logical operator for the code,
so any even number of errors on such states will fail to be
detected by the decoder, while any odd number of errors
will be detected. For each error pattern that is not detected
by the syndrome, one can consider the effect of the
logical errors on the encoded information and encoded
measurement. For example, even numbers of Z errors on
the encoded-measurement ancilla will cancel and cause
the distillation to be accepted. Each pattern of errors on
the resource states can then be classified first by whether
it is detected and then by whether it causes a non-trivial
logical error. Because errors on each state are considered
to be equiprobable and independent, this enumeration
determines the probability a(p) of the distillation routine
accepting and the marginal error probability e(p) of an
output state conditional on acceptance. It happens that
e(p) does not depend on which of the two output states
is considered.

Based on the observation that any single error results
in rejection, a simple estimate of the acceptance probability
is $a(p) = 1 - 10p + O(p^2)$. The exact accounting
yields

$$a(p) = 1 - 10p + 58p^2 - 192p^3 + 400p^4 - 544p^5 + 480p^6 - 256p^7 + 64p^8 .$$

The probability of an undetected error on the output
states is the probability that the routine accepts and
that the output nevertheless has an error. Because any
single error is detected, this probability has order $p^2$.
The marginal undetected-error probability of the first (or
identically the second) output state is

$$u(p) = 9p^2 - 56p^3 + 160p^4 - 256p^5 + 240p^6 - 128p^7 + 32p^8 .$$
FIG. 4. Illustration of a method of classifying errors in our distillation routine. The first circuit shows the full distillation
routine with possible error locations shaded. Using circuit identities, errors at any of these locations can be concentrated into 
one of two regions (and types) yielding an equivalent circuit of the form shown in (b). Any Pauli errors on the lower four (code) 
qubits in the second error region of (b) will be detected by the decoding circuit unless they act as a logical (Pauli) operator 
on the code. The remaining possible errors, those undetectable by the syndrome measurement, may then be enumerated and 
classified efficiently using the logical circuit shown in (c). 



For our purposes, this is the quantity of interest, but
one can also compute the probability of at least one
undetected error on the two outputs. This is given by

$$u_2(p) = 13p^2 - 80p^3 + 228p^4 - 368p^5 + 352p^6 - 192p^7 + 48p^8 .$$

The quality of the distillation routine's output is
quantified by the marginal probability of error of an
output state conditional on acceptance:

$$e(p) = u(p) / a(p) . \qquad (3)$$

The corresponding probability of at least one error on
the two outputs conditioned on acceptance is $e_2(p) = u_2(p)/a(p)$.
It can be shown numerically that $e_2(p) < 2e(p) - e(p)^2$,
so errors on the two output states are
positively correlated. In fact, the probability of an error
on both output states is of order $p^2$.
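
The polynomials of this section are straightforward to evaluate numerically. The sketch below encodes a(p), u(p), and the probability of at least one undetected output error as coefficient lists (lowest order first) and defines the conditional error probabilities of Eq. (3). The coefficients are transcribed from the expressions above as best they can be read, so they should be treated as an assumption and checked against the published formulas before being relied upon.

```python
# Coefficients of p**0, p**1, ..., p**8 (transcribed; see caveat above).
A_COEFFS = [1, -10, 58, -192, 400, -544, 480, -256, 64]
U_COEFFS = [0, 0, 9, -56, 160, -256, 240, -128, 32]
U2_COEFFS = [0, 0, 13, -80, 228, -368, 352, -192, 48]

def poly(coeffs, p):
    """Evaluate sum_k coeffs[k] * p**k using Horner's rule."""
    total = 0.0
    for c in reversed(coeffs):
        total = total * p + c
    return total

def a(p):
    """Acceptance probability of one 10-to-2 round."""
    return poly(A_COEFFS, p)

def u(p):
    """Probability of acceptance with an error on a given output state."""
    return poly(U_COEFFS, p)

def u2(p):
    """Probability of acceptance with an error on at least one output."""
    return poly(U2_COEFFS, p)

def e(p):
    """Marginal output error probability conditional on acceptance, Eq. (3)."""
    return u(p) / a(p)

def e2(p):
    """Probability of at least one output error conditional on acceptance."""
    return u2(p) / a(p)
```

With these definitions, a(0.01) is about 0.906 and e(0.01) is about 9.3e-4, and the inequality e2(p) < 2e(p) - e(p)² can be checked on a grid of input error probabilities.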

IV. DISTILLATION SEQUENCES 

The ultimate goal of magic-state distillation is to
produce resource states of sufficiently high quality that
they can be used to implement all non-Clifford gates
in a computation without significantly increasing the
probability that the computation will fail. A generic
computation will fail if any single gate fails, so the
probability of one or more errors on the R resource
states employed in a computation must be much less
than 1 to ensure that the computation succeeds with high
probability. By the union bound, it is sufficient that the
marginal probability of error on each resource state be
much less than 1/R. Strong correlations can reduce this
requirement on marginal probabilities, but for independent
errors the bound is necessary. Consequently, the
proximate goal of magic-state distillation is to produce
resource states such that the marginal probability of error
for any single state is bounded from above by some goal
error probability, $e_g \ll 1/R$. In algorithms currently
envisioned for quantum computers, R can easily be 10^*^
or more.

In order to obtain resource states with very low prob- 
abilities of error, it is necessary to use multiple rounds of 
distillation, where the input to each round is produced 
by the preceding one. We consider a sequence of such 
rounds where each is based on a single but possibly 
round-dependent distillation routine. In a round based 
on an m-to-n distillation routine, the output resource 
states from the preceding round are grouped into blocks 
of size m, and each block is then distilled to n states, 
which may, in general, have correlated errors. 

The sequence of rounds is chosen to minimize the
number of input resource states needed to produce a
given number of output states with marginal probability
of error $e_g$ or less. In practice, we are interested in the
case where the number of resource states to be prepared
is very large, allowing us to consider only the asymptotic
cost. The cost is defined as the number of input resource
states used per output resource state produced. For one
round of the 10-to-2 distillation routine, the cost is $5/a(p)$,
with marginal probability of error e(p) on the output
states conditioned on acceptance.

The one-round expression for marginal probability
of error given in Sec. III assumes that the input resource
states suffer from errors independently and with
equal probability. Generally, however, the output of a
distillation routine need not satisfy either restriction,
which poses a concern for distillation sequences involving
multiple rounds. If necessary, distillation routines can be
output symmetrized by randomly permuting the output
states, thereby ensuring that the output states from a
given round all have the same error probability. Independence
is a concern whenever a routine that outputs
more than one state per instance is used, since errors on
the states output by one instance of such a routine are
usually not independent. For example, the probability of
two errors in the output of the 10-to-2 distillation routine
is of the same order as that for one error. Performing a
distillation routine using such correlated states as input
can substantially increase the output error probability.
To avoid this effect, it is sufficient to ensure that no
instance of a routine depends on more than one output
from any previously executed instance of a routine.
The following lemma and its corollary show that this
strategy works without an increase in asymptotic cost.
As a consequence we can calculate the asymptotic cost
as if the output states of all routines were completely
independent.

Lemma IV.1. Let $\mathcal{D}$ be an m-to-n output-symmetrized
distillation routine with acceptance probability a(p) and
conditional output error probability e(p) for each output
state. Given a block of K independent resource states,
each with error probability p, one can produce n blocks of
output states where each block's states are independent
within the block and have probability of error e(p). Each
block contains $a(p) \lfloor K/m \rfloor$ states on average.

Proof. Partition the K resource states into $\lfloor K/m \rfloor$ sets
of m states, discarding any remaining ones. Apply $\mathcal{D}$ to
each set of m states, getting n states with probability a(p)
in each case. Conditional on acceptance, each output
state has marginal probability of error e(p), though these
errors are not independent. Form n blocks by taking
the j-th output state from each successful distillation, for
j = 1, ..., n. These blocks have the desired properties
for a given pattern of distillation successes. Because
the error probabilities of the j-th output states of the
successful distillations do not depend on the pattern of
successes, any such arrangement of the output states that
depends only on the pattern of successes preserves this
independence. □
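
The bookkeeping in this proof can be sketched as follows. The function partitions the input block into sets of m, applies a distillation routine to each set, and collects the j-th outputs of the accepted instances into the j-th block, so that no block ever contains two states from the same instance. The `toy_distill` stand-in is purely hypothetical: it rejects every third instance and tags each output with its instance and output index, which makes the regrouping visible without simulating any quantum states.

```python
def form_blocks(states, m, n, distill):
    """Regroup distillation outputs into n blocks of mutually independent states."""
    num_sets = len(states) // m           # floor(K/m); leftover states are discarded
    blocks = [[] for _ in range(n)]
    for i in range(num_sets):
        outputs = distill(i, states[i * m:(i + 1) * m])
        if outputs is None:               # this instance was rejected
            continue
        for j in range(n):
            blocks[j].append(outputs[j])  # j-th output goes to the j-th block
    return blocks

def toy_distill(instance, inputs):
    """Hypothetical 10-to-2 stand-in: rejects every third instance."""
    if instance % 3 == 2:
        return None
    return [(instance, j) for j in range(2)]

blocks = form_blocks(list(range(47)), m=10, n=2, distill=toy_distill)
```

With 47 input states and m = 10, four instances are run and seven states are discarded; instance 2 is rejected, so each of the two blocks receives one state from each of instances 0, 1, and 3.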

Corollary IV.2. If, in Lem. IV.1, K is random with
average $\langle K \rangle$, then the expected total number of output
states is at least $a(p)\, n \left( \langle K \rangle / m - 1 \right)$. The average size of
each of the n output blocks of independent states is at
least $a(p) \left( \langle K \rangle / m - 1 \right)$.

FIG. 5. Plots of the marginal error probabilities conditional
on acceptance for the 10-to-2 (solid) and 15-to-1 (dotted)
routines. The dashed line indicates the output if no distillation
is performed. The thresholds for the two routines are
determined by the first intersections with this line.

A multi-round distillation routine can now be formulated
as follows: Assume that after round $l - 1$ there
are $N_{l-1}$ blocks of resource states, where within each
block the states are independent with identical error
probabilities $p_{l-1}$, and the number of states in each
block is $\langle K_{l-1} \rangle$ on average. Applying the procedure of
Lem. IV.1 to each block with an $m_l$-to-$n_l$ distillation
routine $\mathcal{D}_l$ yields $N_l = N_{l-1} n_l$ output blocks, where each
output block has

$$\langle K_l \rangle \ge a_l(p_{l-1}) \left( \frac{\langle K_{l-1} \rangle}{m_l} - 1 \right)$$

resource states on average, independent within a block and each
with error probability $p_l = e_l(p_{l-1})$. The first round
starts with $K_0$ independent resource states, each of which
suffers an error with probability $p_0$. For large $K_0$, the
constant offsets of -1 in the expressions are negligible.
Consequently, the asymptotic cost $c_l$ of resource-state
production after round l satisfies $c_l = m_l c_{l-1} / \left( n_l\, a_l(p_{l-1}) \right)$,
where $c_0 = 1$. The error probability after round l satisfies
$p_l = e_l(p_{l-1})$.
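
The recursion for the asymptotic cost and error probability lends itself to a short numerical sketch. The code below assumes that the cost after each round is the previous cost multiplied by m/(n a(p)), with the starting cost equal to 1, and uses the 10-to-2 polynomials of Sec. III as transcribed here (an assumption to be checked against the published expressions). The tuple layout `(m, n, accept, error)` for a routine is an illustrative choice.

```python
A_COEFFS = [1, -10, 58, -192, 400, -544, 480, -256, 64]
U_COEFFS = [0, 0, 9, -56, 160, -256, 240, -128, 32]

def poly(coeffs, p):
    """Evaluate sum_k coeffs[k] * p**k using Horner's rule."""
    total = 0.0
    for c in reversed(coeffs):
        total = total * p + c
    return total

def accept_10_to_2(p):
    return poly(A_COEFFS, p)

def error_10_to_2(p):
    return poly(U_COEFFS, p) / poly(A_COEFFS, p)

ROUTINE_A = (10, 2, accept_10_to_2, error_10_to_2)

def run_sequence(routines, p0):
    """Return (asymptotic cost, output error) after applying the routines in order."""
    cost, p = 1.0, p0
    for m, n, accept, error in routines:
        cost = m * cost / (n * accept(p))   # inputs consumed per surviving output
        p = error(p)                        # error probability fed to the next round
    return cost, p

cost_a, err_a = run_sequence([ROUTINE_A], 0.01)
cost_aa, err_aa = run_sequence([ROUTINE_A] * 2, 0.01)
cost_aaa, err_aaa = run_sequence([ROUTINE_A] * 3, 0.01)
```

Starting from p = 0.01, one, two, and three rounds of the 10-to-2 routine give costs of roughly 5.5, 27.9, and 139.3 input states per output state.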



V. COMPARATIVE PERFORMANCE 

At present, practical fault-tolerant architectures require
physical-gate error probabilities well below .01,
which suggests that it should be possible to directly
prepare resource states with error probabilities of a few
percent or less. Given such states, the most practical
routine for $|H\rangle$ distillation developed to date is the
15-to-1 routine. In this section, we compare the performance
of the 10-to-2 routine to that of the 15-to-1 routine and
consider the effect of using them in concert.
FIG. 6. Output error probability as a function of input error probability for various sequences of the 10-to-2 (A) and 15-to-1 (B)
$|H\rangle$ distillation routines. Each curve is labeled by the associated sequence of routines, e.g., BAA denotes the 15-to-1 routine
followed by two rounds of the 10-to-2 routine. The region directly underneath each labeled curve is the region in which the
labeled strategy is preferred.



Routines for magic-state distillation are typically
judged on the basis of their threshold, that is, the
error probability below which resource states can be
successfully distilled. At the threshold, a distillation
routine outputs resource states no better than the inputs.
Thus, the threshold $p_t$ for the 10-to-2 routine can be
determined from Eq. (3) by considering solutions to
$p_t = e(p_t)$. This yields a threshold of $p_t = 0.089$,
which is substantially below the threshold of 0.141 for
the 15-to-1 routine [ '], but either threshold should be
adequate for the error regime of interest. The curves for
the marginal output error probability of the 10-to-2 and
15-to-1 routines are plotted in Fig. 5.
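
The threshold of the 10-to-2 routine can be located numerically as the nontrivial fixed point of the output error probability. The sketch below applies bisection to e(p) - p, using the polynomials of Sec. III as transcribed here (an assumption to be checked against the published expressions). The bracketing interval [0.05, 0.10] is an illustrative choice on which the function changes sign.

```python
A_COEFFS = [1, -10, 58, -192, 400, -544, 480, -256, 64]
U_COEFFS = [0, 0, 9, -56, 160, -256, 240, -128, 32]

def poly(coeffs, p):
    """Evaluate sum_k coeffs[k] * p**k using Horner's rule."""
    total = 0.0
    for c in reversed(coeffs):
        total = total * p + c
    return total

def e(p):
    """Marginal output error probability conditional on acceptance."""
    return poly(U_COEFFS, p) / poly(A_COEFFS, p)

def threshold(lo=0.05, hi=0.10, iterations=60):
    """Bisection for the root of e(p) - p inside [lo, hi]."""
    if not (e(lo) - lo < 0 < e(hi) - hi):
        raise ValueError("interval does not bracket the threshold")
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if e(mid) - mid < 0:
            lo = mid          # distillation still improves the state at mid
        else:
            hi = mid          # output is no better than the input at mid
    return (lo + hi) / 2

p_t = threshold()
```

The value obtained in this way agrees with the threshold quoted above to the stated precision.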

The efficiency of a distillation routine can be characterized,
as detailed in Ref. [-], by the output error
probability as a function of the number of resource states
employed. In the limit of small initial error probability p,
the output error probability for the 10-to-2 routine after l
rounds of distillation is $\frac{1}{9}(9p)^{2^l}$. In the limit of both small
p and many output states, l rounds of distillation require
$k = 5^l$ input resource states per output. Consequently,
taking l to be continuous, the asymptotic output error
probability as a function of the number of input resource
states expended is $\frac{1}{9}(9p)^{k^{\xi}}$, where $\xi = \log 2 / \log 5 \approx .43$.
The corresponding exponent for the 15-to-1 routine is
.4, so the 10-to-2 routine performs slightly better for this
metric. However, these smooth functions hide the step
discontinuities induced by using sequences of increasing
integral lengths (as seen in Fig. 7) and can be misleading
for practical comparisons.
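
The exponents quoted above follow from eliminating the number of rounds between the error-suppression law and the input count. A brief sketch: if each round raises the error to the power `order` and multiplies the number of inputs per output by m/n, then after l rounds order^l = k^ξ with ξ = log(order)/log(m/n). The function names below are illustrative, and the leading-order form (9p)^(2^l)/9 is used for the 10-to-2 routine.

```python
import math

def exponent(m, n, order):
    """xi such that order**l == k**xi when k = (m / n)**l."""
    return math.log(order) / math.log(m / n)

XI_10_TO_2 = exponent(10, 2, 2)    # log 2 / log 5
XI_15_TO_1 = exponent(15, 1, 3)    # log 3 / log 15

def error_after_rounds(p, rounds):
    """Leading-order output error of the 10-to-2 routine after `rounds` rounds."""
    return (9 * p) ** (2 ** rounds) / 9

def error_from_cost(p, k):
    """The same quantity expressed through the number of inputs per output, k."""
    return (9 * p) ** (k ** XI_10_TO_2) / 9
```

For integer numbers of rounds the two expressions coincide, for example at three rounds with k = 5³ = 125; between those points the smooth form interpolates, which is why it hides the step structure mentioned above.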

Of greater utility to us is the cost, in resource states 
consumed per output state, required to obtain resource 
states of sufficiently high quality for useful quantum com- 
putations, given resource states with error probabilities 
in the range of 0.01 to 10~^. The cost depends on 
the distillation sequence, which generally entails multiple 
rounds of distillation. For the purpose of optimizing the 
distillation sequence, one can consider arbitrary routines 
at each round. Here, we consider sequences involving the 
10-to-2 and 15-to-1 distillation routines.

In Fig. 6 we plot the output error probability as a
function of input error probability for various sequences.
Data for the 15-to-1 routine was computed using the
expressions corresponding to a(p) and e(p) in Eq. (35)
and Eq. (36) of Ref. [2]. In the region plotted, distillation
sequences with higher output error (lower curves) also
require fewer input resource states per output state.
Consequently, for a given output error goal, $e_g$, and input
error probability p, the label of the nearest curve above
the point $(p, e_g)$ in the plot gives the best distillation
sequence involving the 10-to-2 and/or 15-to-1 routines.

Table I shows the costs and improvements for a number
of distillation sequences given an initial error probability
of p = 0.01. Not surprisingly, the table shows that
the 10-to-2 routine has a smaller cost, but the 15-to-1
routine has greater improvement in error probability per
round. For distillations that use both routines, we find
numerically that if 15-to-1 rounds are used they should
be placed first. This is intuitively consistent with the
higher threshold for the 15-to-1 routine, which suggests
better performance at high error probabilities.

FIG. 7. Log-log plot of the cost required to produce output states satisfying a goal error probability ($e_g$) given input states with
error probability 0.01. The upper horizontal segments show the cost for the best sequence involving only the 15-to-1 routine.
The lower horizontal segments show the cost for the best sequence of routines that achieves $e_g$ or better. The gray regions
indicate the improvement obtained using the 10-to-2 routine.

The cost improvements shown in Tab. I are illustrated
more visually in Fig. 7, which shows the production cost,
at a fixed input error probability p = 0.01 and as a
function of $e_g$, of the best distillation sequence compared
to the best sequence using only the 15-to-1 routine.
For example, a goal error probability near $10^{-5}$ can be
achieved by using either two rounds of the 15-to-1 routine
at a production cost of 261.7 or two rounds of the 10-to-2
routine at a cost of 27.9. In this case, the improvement
in production cost obtained by incorporating the 10-to-2
routine is a factor of 9.4.
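
The comparison in this example can be reproduced with a short calculation. The sketch below combines the 10-to-2 polynomials of Sec. III (as transcribed here) with closed-form acceptance and error expressions for the 15-to-1 routine. The latter are written from memory of the Bravyi-Kitaev analysis and are an assumption: they should be checked against Eqs. (35) and (36) of Ref. [2] before use. The cost recursion multiplies the running cost by m/(n a(p)) at each round.

```python
A_COEFFS = [1, -10, 58, -192, 400, -544, 480, -256, 64]
U_COEFFS = [0, 0, 9, -56, 160, -256, 240, -128, 32]

def poly(coeffs, p):
    """Evaluate sum_k coeffs[k] * p**k using Horner's rule."""
    total = 0.0
    for c in reversed(coeffs):
        total = total * p + c
    return total

def accept_a(p):
    return poly(A_COEFFS, p)

def error_a(p):
    return poly(U_COEFFS, p) / poly(A_COEFFS, p)

def accept_b(p):
    """15-to-1 acceptance probability (assumed form; verify against Ref. [2])."""
    return (1 + 15 * (1 - 2 * p) ** 8) / 16

def error_b(p):
    """15-to-1 conditional output error (assumed form; verify against Ref. [2]).

    The numerator cancels to order p**3, so for extremely small p this
    expression loses floating-point accuracy; two rounds are fine.
    """
    x = 1 - 2 * p
    return (1 - 15 * x ** 7 + 15 * x ** 8 - x ** 15) / (2 * (1 + 15 * x ** 8))

ROUTINES = {"A": (10, 2, accept_a, error_a), "B": (15, 1, accept_b, error_b)}

def sequence_cost(label, p0):
    """Return (cost, output error) for a sequence such as 'AA' or 'BB'."""
    cost, p = 1.0, p0
    for name in label:
        m, n, accept, error = ROUTINES[name]
        cost = m * cost / (n * accept(p))
        p = error(p)
    return cost, p

cost_aa, err_aa = sequence_cost("AA", 0.01)
cost_bb, err_bb = sequence_cost("BB", 0.01)
improvement = cost_bb / cost_aa
```

Under these assumptions the two-round costs come out near 27.9 and 261.7, giving an improvement factor close to 9.4, in line with the figures quoted above.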



Distillation   Cost      Output error         Cost improv.
scheme                   probability, e(p)    factor
A                 5.5    9 × 10^-4            3.2
B                17.4    4 × 10^-5            1
AA               27.9    7 × 10^-6            9.4
BA               87.2    1 × 10^-8            3.0
AAA             139.3    5 × 10^-10           1.9
BB              261.7    2 × 10^-12           1
BAA             436.2    1 × 10^-15           9.0
AAAA            696.6    2 × 10^-18           5.6
BBA            1308.7    2 × 10^-23           3.0
BAAA           2180.8    1 × 10^-29           1.8

TABLE I. Costs and output error probabilities at p = 0.01.
The labels for the distillation schemes follow the convention
given in Fig. 6. The cost improvement factor is with respect
to the shortest sequence using only the 15-to-1 routine that
achieves at least as good an output error probability.



VI. CONCLUSIONS



Magic-state distillation enables universal quantum 
computing given only mediocre copies of a non-stabilizer 
state and high-quality Clifford operations. Considering 
the importance of Clifford-based techniques to the theory 
of fault tolerance, we expect that magic-state distillation 
will prove valuable for the practical implementation of 
quantum computers. 

At the logical level, computationally useful quantum
algorithms involve many non-Clifford gates, generally
enough to account for a significant fraction of all gates
employed. At least one high-quality magic state is
required for the indirect implementation of each non-Clifford
gate, so it is important to minimize the resources
needed for the distillation of such states.

In this work, we contributed to the goal of resource
reduction by introducing an $|H\rangle$ distillation routine that
reduces the error probability for faulty $|H\rangle$ states from
p to $O(p^2)$ and produces 2 output states using 10 input
states. By judiciously combining this routine with the
higher-order (p to $O(p^3)$) but higher-cost 15-to-1 routine
from Refs. [2, 3], we showed that the number of faulty
$|H\rangle$ states required to distill states of a given quality can
be reduced by up to an order of magnitude. Inclusion
of additional distillation routines in the analysis would
likely lead to further improvements.
Appendix

Fig. 8 provides additional details about the circuit 
identities used in Fig. 2. 
FIG. 8. Additional details regarding the relations between circuits in Fig. 2. (a) A sequence of circuit identities deriving the
form of the encoded Hadamard gate used in Fig. 2(b). The starting circuit implements H on the second logical qubit of the code
C4 by decoding the logical qubits into the first and third physical qubits, applying H to the third qubit, and re-encoding. The
first equivalence is obtained by commuting and cancelling pairs of controlled-X gates. The second equivalence uses the decomposition of
the controlled-X gate into controlled-Z and H gates several times as well as the identity $H^2 = I$. The final equivalence uses two facts: ZX = iY
and the order of a controlled-X and a controlled-Z gate with the same target can be exchanged if a controlled-Z gate is added between the controls. (b) A
sequence of circuit identities showing why the pair of controlled-H gates targeting the fourth qubit in Fig. 2(b) can be eliminated. Other
than reorganization of commuting gates, these equivalences rely on the fact that, because H anticommutes with Y, controlled-H and controlled-Y
gates with the same target can be exchanged if a controlled-Z gate between the two controls is added.



